Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(rules): automatic refire with backoff #732

Merged
merged 5 commits into from
Dec 20, 2024

Conversation

andrewazores
Copy link
Member

@andrewazores andrewazores commented Dec 3, 2024

Welcome to Cryostat! 👋

Before contributing, make sure you have:

  • Read the contributing guidelines
  • Linked a relevant issue which this PR resolves
  • Linked any other relevant issues, PR's, or documentation, if any
  • Resolved all conflicts, if any
  • Rebased your branch PR on top of the latest upstream main branch
  • Attached at least one of the following labels to the PR: [chore, ci, docs, feat, fix, test]
  • Signed all commits using a GPG signature

To recreate commits with GPG signature git fetch upstream && git rebase --force --gpg-sign upstream/main


Fixes: #731

Description of the change:

Adds logic for automatic retry when an automated rule is activated but fails. This can happen because the discovered target application is not yet actually ready to accept requests, or because of some network hiccup, etc. Previously if an automated rule activation failed then Cryostat would never retry and the rule would not take effect on the target, until some other event caused a reactivation (rule disable/reenable, stored credentials added, target modified, target lost/found).

Motivation for the change:

Increases reliability of automated rules.

How to manually test:

  1. Deploy on k8s, ex. using Helm chart
  2. Deploy sample application. I used the Operator's make sample_app
  3. Create an Automated Rule that matches the sample application(s)
  4. Use various oc rollout restart deployment quarkus-test, oc scale deployment quarkus-test --replicas=n commands to cause targets to appear and disappear. Verify that automated rules consistently activate against these targets. There may be a slight delay in activation at times due to the retry/backoff behaviour.

@andrewazores
Copy link
Member Author

/build_test

Copy link

github-actions bot commented Dec 3, 2024

Workflow started at 12/3/2024, 3:22:10 PM. View Actions Run.

Copy link

github-actions bot commented Dec 3, 2024

No OpenAPI schema changes detected.

Copy link

github-actions bot commented Dec 3, 2024

No GraphQL schema changes detected.

Copy link

github-actions bot commented Dec 3, 2024

CI build and push: At least one test failed ❌
https://github.com/cryostatio/cryostat/actions/runs/12147480000

@andrewazores
Copy link
Member Author

/build_test

Copy link

github-actions bot commented Dec 5, 2024

Workflow started at 12/5/2024, 9:49:48 AM. View Actions Run.

Copy link

github-actions bot commented Dec 5, 2024

No GraphQL schema changes detected.

Copy link

github-actions bot commented Dec 5, 2024

No OpenAPI schema changes detected.

Copy link

github-actions bot commented Dec 5, 2024

CI build and push: All tests pass ✅
https://github.com/cryostatio/cryostat/actions/runs/12182170373

@andrewazores andrewazores marked this pull request as ready for review December 5, 2024 15:24
@andrewazores andrewazores requested a review from a team December 5, 2024 15:28
@andrewazores andrewazores merged commit d9fb5f8 into cryostatio:main Dec 20, 2024
7 of 8 checks passed
@andrewazores andrewazores deleted the rule-refire branch December 20, 2024 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Automated Rules do not re-fire if initial Target connection attempt fails
2 participants